Smart Paper Notes

MR4MR: Mixed Reality for Melody Reincarnation

Atsuya Kobayashi , Ryogo Ishino , Ryuku Nobusue , Takumi Inoue , Keisuke Okazaki , Shoma Sawa , Nao Tokui

Category: Artificial Intelligence

2022-09-15

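There is a long history of efforts to explore musical elements through the entities and spaces around us, as in musique concrète and ambient music. In computer music and digital art, interactive experiences centered on surrounding objects and physical spaces have also been designed. In recent years, as devices have developed and become widespread, a growing number of works have been designed in extended reality to create such musical experiences. In this paper, we describe MR4MR, a sound installation that lets users experience melodies produced by interacting with their surrounding space in the context of mixed reality (MR). Using HoloLens, users can bump virtual objects against real objects in their surroundings. Then, by creating a melody that follows the sounds emitted by the objects and using music-generation machine learning models to vary it randomly and gradually, the system lets users feel their ambient melody "reincarnating".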


I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

Chandra Bhagavatula , Jena D. Hwang , Doug Downey , Ronan Le Bras , Ximing Lu , Keisuke Sakaguchi , Swabha Swayamdipta , Peter West , Yejin Choi

Category: Natural Language Processing

2022-12-19

Pre-trained language models, despite their rapid advancements powered by scale, still fall short of robust commonsense capabilities. And yet, scale appears to be the winning recipe; after all, the largest models seem to have acquired the largest amount of commonsense capabilities. Or is it? In this paper, we investigate the possibility of a seemingly impossible match: can smaller language models with dismal commonsense capabilities (i.e., GPT-2), ever win over models that are orders of magnitude larger and better (i.e., GPT-3), if the smaller models are powered with novel commonsense distillation algorithms? The key intellectual question we ask here is whether it is possible, if at all, to design a learning algorithm that does not benefit from scale, yet leads to a competitive level of commonsense acquisition. In this work, we study the generative models of commonsense knowledge, focusing on the task of generating generics, statements of commonsense facts about everyday concepts, e.g., birds can fly. We introduce a novel commonsense distillation framework, I2D2, that loosely follows the Symbolic Knowledge Distillation of West et al. but breaks the dependence on the extreme-scale models as the teacher model by two innovations: (1) the novel adaptation of NeuroLogic Decoding to enhance the generation quality of the weak, off-the-shelf language models, and (2) self-imitation learning to iteratively learn from the model's own enhanced commonsense acquisition capabilities. Empirical results suggest that scale is not the only way, as novel algorithms can be a promising alternative. Moreover, our study leads to a new corpus of generics, Gen-A-Tomic, that is of the largest and highest quality available to date.


Learning Locally, Communicating Globally: Reinforcement Learning of Multi-robot Task Allocation for Cooperative Transport

Kazuki Shibata , Tomohiko Jimbo , Tadashi Odashima , Keisuke Takeshita , Takamitsu Matsubara

Category: Robotics

2022-12-06

We consider task allocation for multi-object transport using a multi-robot system, in which each robot selects one object among multiple objects with different and unknown weights. The existing centralized methods assume the number of robots and tasks to be fixed, which is inapplicable to scenarios that differ from the learning environment. Meanwhile, the existing distributed methods limit the minimum number of robots and tasks to a constant value, making them applicable to various numbers of robots and tasks. However, they cannot transport an object whose weight exceeds the load capacity of robots observing the object. To make it applicable to various numbers of robots and objects with different and unknown weights, we propose a framework using multi-agent reinforcement learning for task allocation. First, we introduce a structured policy model consisting of 1) predesigned dynamic task priorities with global communication and 2) a neural network-based distributed policy model that determines the timing for coordination. The distributed policy builds consensus on the high-priority object under local observations and selects cooperative or independent actions. Then, the policy is optimized by multi-agent reinforcement learning through trial and error. This structured policy of local learning and global communication makes our framework applicable to various numbers of robots and objects with different and unknown weights, as demonstrated by numerical simulations.


Hybrid Life: Integrating Biological, Artificial, and Cognitive Systems

Manuel Baltieri , Hiroyuki Iizuka , Olaf Witkowski , Lana Sinapayen , Keisuke Suzuki

Category: Artificial Intelligence

2022-12-01

Artificial life is a research field studying what processes and properties define life, based on a multidisciplinary approach spanning the physical, natural and computational sciences. Artificial life aims to foster a comprehensive study of life beyond "life as we know it" and towards "life as it could be", with theoretical, synthetic and empirical models of the fundamental properties of living systems. While still a relatively young field, artificial life has flourished as an environment for researchers with different backgrounds, welcoming ideas and contributions from a wide range of subjects. Hybrid Life is an attempt to bring attention to some of the most recent developments within the artificial life community, rooted in more traditional artificial life studies but looking at new challenges emerging from interactions with other fields. In particular, Hybrid Life focuses on three complementary themes: 1) theories of systems and agents, 2) hybrid augmentation, with augmented architectures combining living and artificial systems, and 3) hybrid interactions among artificial and biological systems. After discussing some of the major sources of inspiration for these themes, we will focus on an overview of the works that appeared in Hybrid Life special sessions, hosted by the annual Artificial Life Conference between 2018 and 2022.


Location analysis of players in UEFA EURO 2020 and 2022 using generalized valuation of defense by estimating probabilities

Rikuhei Umemoto , Kazushi Tsutsui , Keisuke Fujii

Category: Machine Learning

2022-11-30

Analyzing defenses in team sports is generally challenging because of the limited event data. Researchers have previously proposed methods to evaluate football team defense by predicting the events of ball gain and being attacked using locations of all players and the ball. However, they did not consider the importance of the events, assumed the perfect observation of all 22 players, and did not fully investigate the influence of diversity (e.g., nationality and sex). Here, we propose a generalized valuation method of defensive teams by score-scaling the predicted probabilities of the events. Using the open-source location data of all players in broadcast video frames in football games of men's Euro 2020 and women's Euro 2022, we investigated the effect of the number of players on the prediction and validated our approach by analyzing the games. Results show that for the predictions of being attacked, scoring, and conceding, all players' information was not necessary, while that of ball gain required information on three to four offensive and defensive players. With game analyses we explained the excellence in defense of finalist teams in Euro 2020. Our approach might be applicable to location data from broadcast video frames in football games.


Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps

Hiroki Iida , Naoaki Okazaki

Category: Natural Language Processing

2022-11-08

IR models using a pretrained language model significantly outperform lexical approaches like BM25. In particular, SPLADE, which encodes texts to sparse vectors, is an effective model for practical use because it shows robustness to out-of-domain datasets. However, SPLADE still struggles with exact matching of low-frequency words in training data. In addition, domain shifts in vocabulary and word frequencies deteriorate the IR performance of SPLADE. Because supervision data are scarce in the target domain, addressing the domain shifts without supervision data is necessary. This paper proposes an unsupervised domain adaptation method by filling vocabulary and word-frequency gaps. First, we expand a vocabulary and execute continual pretraining with a masked language model on a corpus of the target domain. Then, we multiply SPLADE-encoded sparse vectors by inverse document frequency weights to consider the importance of documents with low-frequency words. We conducted experiments using our method on datasets with a large vocabulary gap from a source domain. We show that our method outperforms the present state-of-the-art domain adaptation method. In addition, our method achieves state-of-the-art results, combined with BM25.
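
The IDF-weighting step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the sparse vectors are plain `term -> weight` dictionaries standing in for SPLADE output, the smoothed IDF formula and the helper names (`idf_weights`, `apply_idf`) are assumptions, and the abstract does not say on which corpus the document frequencies are counted.

```python
import math
from collections import Counter

def idf_weights(doc_vectors):
    """Smoothed IDF per term, counted over sparse vectors (term -> weight).

    The smoothing ln((1 + N) / (1 + df)) + 1 is an assumed choice.
    """
    n_docs = len(doc_vectors)
    # Document frequency: number of vectors in which the term is active.
    df = Counter(t for vec in doc_vectors for t, w in vec.items() if w > 0)
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

def apply_idf(vec, idf, default=1.0):
    """Multiply each term weight by its IDF; unseen terms keep their weight."""
    return {t: w * idf.get(t, default) for t, w in vec.items()}

# Toy stand-ins for SPLADE-encoded documents (hypothetical values).
docs = [
    {"protein": 1.0, "cell": 0.5},
    {"cell": 1.0, "study": 0.5},
    {"cell": 0.8, "study": 0.7},
]
idf = idf_weights(docs)
weighted = [apply_idf(d, idf) for d in docs]
```

In this toy corpus the rare term `protein` receives a larger multiplier than the ubiquitous term `cell`, so documents containing low-frequency words gain weight on exactly those terms.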


Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows

Keisuke Shirai , Atsushi Hashimoto , Taichi Nishimura , Hirotaka Kameko , Shuhei Kurita , Yoshitaka Ushiku , Shinsuke Mori

Category: Natural Language Processing | Artificial Intelligence

2022-09-13

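We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the result of each cooking action. The dataset consists of object state changes and the workflow of the recipe text. A state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). The image pairs are grounded in the r-FG, which provides the cross-modal relation. With our dataset, one can try a range of applications, from multimodal commonsense reasoning to procedural text generation.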


Nearest Neighbor Non-autoregressive Text Generation

Ayana Niwa , Sho Takase , Naoaki Okazaki

Category: Natural Language Processing

2022-08-26

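Non-autoregressive (NAR) models can generate sentences with less computation than autoregressive models, but at the cost of generation quality. Previous studies addressed this issue through iterative decoding. This study proposes using nearest neighbors as the initial state of an NAR decoder and editing them iteratively. We present a novel training strategy that learns edit operations on the neighbors to improve NAR text generation. Experimental results show that the proposed method (NeighborEdit) achieves higher translation quality (1.69 points higher than the vanilla Transformer) with fewer decoding iterations (fewer than one-tenth) on the JRC-Acquis En-De dataset, which is used for translation with nearest neighbors. We also confirm the effectiveness of the proposed method on a data-to-text task (WikiBio). In addition, the proposed method outperforms an NAR baseline on the WMT'14 En-De dataset. We also report an analysis of the neighbor examples used in the proposed method.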



Automatic detection of faults in race walking from a smartphone camera: a comparison of an Olympic medalist and university athletes

Tomohiro Suzuki , Kazuya Takeda , Keisuke Fujii

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-08-24

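Automatic fault detection is a major challenge in many sports. In race walking, referees visually judge faults according to the rules. Hence, ensuring objectivity and fairness in judging is important. To address this issue, some studies have attempted to use sensors and machine learning to detect faults automatically. However, there are problems associated with sensor attachment and with equipment such as high-speed cameras, which conflict with the referees' visual judgment, and with the interpretability of the fault detection models. In this study, we propose a fault detection system based on non-contact measurement. We used pose estimation and machine learning models trained on the judgments of multiple qualified referees to achieve fair fault judgment. We verified the system using smartphone videos of normal race walking and of walking with intentional faults by several athletes, including a medalist of the Tokyo Olympics. The validation results show that the proposed system detected faults with an average accuracy of over 90%. We also show that the machine learning model detects faults in accordance with the rules of race walking. In addition, the intentionally faulty walking motion of the medalist differed from that of the university walkers. This finding informs the realization of a more general fault detection model. The code and data are available at https://github.com/szucchini/racewalk-aijudge.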


Are Neighbors Enough? Multi-Head Neural n-gram can be Alternative to Self-attention

Mengsay Loem , Sho Takase , Masahiro Kaneko , Naoaki Okazaki

Category: Natural Language Processing

2022-07-27

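The impressive performance of the Transformer has been attributed to self-attention, in which dependencies across the entire input are considered at every position. In this work, we reform the neural $n$-gram model, which focuses on only a few surrounding representations of each position, with a multi-head mechanism as in Vaswani et al. (2017). Through experiments on sequence-to-sequence tasks, we show that replacing self-attention in the Transformer with multi-head neural $n$-gram can achieve comparable or better performance than the Transformer. From various analyses of our proposed method, we find that multi-head neural $n$-gram is complementary to self-attention, and that their combination can further improve the performance of the vanilla Transformer.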
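
To make the idea of a multi-head neural $n$-gram layer concrete, here is a rough sketch rather than the authors' formulation. It assumes a causal window (the current position plus the `n - 1` preceding ones, zero-padded at the start), one linear map per head with no bias or nonlinearity, and random weights in place of trained ones. The function name `multi_head_ngram` is hypothetical.

```python
import random

def multi_head_ngram(x, weights, n):
    """Apply a per-head linear map to a local window of n positions.

    x:       list of T vectors, each of length d
    weights: list of H matrices, each (n * d_h) rows by d_h columns, d = H * d_h
    """
    num_heads = len(weights)
    d_h = len(x[0]) // num_heads
    out = []
    for t in range(len(x)):
        row = []
        for h, w in enumerate(weights):
            lo, hi = h * d_h, (h + 1) * d_h
            # Concatenate this head's slice of the n most recent positions,
            # zero-padding before the start of the sequence.
            window = []
            for k in range(t - n + 1, t + 1):
                window.extend(x[k][lo:hi] if k >= 0 else [0.0] * d_h)
            # Linear map from the (n * d_h)-dim window to d_h outputs.
            row.extend(
                sum(window[i] * w[i][j] for i in range(n * d_h))
                for j in range(d_h)
            )
        out.append(row)
    return out

random.seed(0)
T, d, H, n = 5, 4, 2, 3
d_h = d // H
x = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(T)]
weights = [
    [[random.uniform(-1, 1) for _ in range(d_h)] for _ in range(n * d_h)]
    for _ in range(H)
]
y = multi_head_ngram(x, weights, n)
```

Unlike self-attention, each output position here depends only on its `n`-position window, so changing the first input vector leaves outputs at positions `n` and later unchanged.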
